Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid
Abstract
Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as that of decision trees. We then propose a new algorithm, NBTree, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits as in regular decision trees, but the leaves contain Naive-Bayesian classifiers. The approach retains the interpretability of Naive-Bayes and decision trees, while resulting in classifiers that frequently outperform both constituents, especially in the larger databases tested.
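The hybrid structure described in the abstract (univariate splits at internal nodes, Naive-Bayes classifiers at the leaves) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration, not the paper's implementation: the names NaiveBayesLeaf, build, predict and accuracy are hypothetical, attributes are assumed categorical, Laplace smoothing is used, and a split is accepted when it raises training-set accuracy of the resulting leaves, since the abstract does not state the split criterion.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesLeaf:
    """Categorical naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, rows, labels, n_values, classes):
        self.classes = classes
        self.n_values = n_values            # distinct values per attribute
        self.class_counts = Counter(labels)
        self.total = len(labels)
        self.counts = Counter()             # (attribute, value, class) -> count
        for row, label in zip(rows, labels):
            for a, v in enumerate(row):
                self.counts[(a, v, label)] += 1

    def predict(self, row):
        def log_posterior(c):
            n_c = self.class_counts[c]
            score = math.log((n_c + 1) / (self.total + len(self.classes)))
            for a, v in enumerate(row):
                score += math.log((self.counts[(a, v, c)] + 1)
                                  / (n_c + self.n_values[a]))
            return score
        return max(self.classes, key=log_posterior)


def accuracy(leaf, rows, labels):
    """Fraction of rows the given leaf classifies correctly."""
    return sum(leaf.predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def build(rows, labels, n_values, classes, max_depth=2):
    """Return a NaiveBayesLeaf or a split node (attribute, children, fallback)."""
    leaf = NaiveBayesLeaf(rows, labels, n_values, classes)
    best_score, best = accuracy(leaf, rows, labels), None
    if max_depth == 0 or best_score == 1.0:
        return leaf
    for a in range(len(n_values)):
        parts = defaultdict(list)           # attribute value -> [(row, label)]
        for row, y in zip(rows, labels):
            parts[row[a]].append((row, y))
        if len(parts) < 2:
            continue
        correct = 0
        for part in parts.values():
            sub_rows = [r for r, _ in part]
            sub_labels = [y for _, y in part]
            sub = NaiveBayesLeaf(sub_rows, sub_labels, n_values, classes)
            correct += sum(sub.predict(r) == y for r, y in part)
        # Assumed criterion: keep the split only if the leaves fit better.
        if correct / len(rows) > best_score:
            best_score, best = correct / len(rows), (a, parts)
    if best is None:
        return leaf
    a, parts = best
    children = {
        v: build([r for r, _ in part], [y for _, y in part],
                 n_values, classes, max_depth - 1)
        for v, part in parts.items()
    }
    return (a, children, leaf)


def predict(node, row):
    """Follow univariate splits down to a leaf, then ask its naive Bayes."""
    while isinstance(node, tuple):
        attr, children, fallback = node
        node = children.get(row[attr], fallback)
    return node.predict(row)


# Toy data: the class is attribute 0 XOR attribute 1; attribute 2 is noise.
rows = [(a0, a1, a2) for a0 in (0, 1) for a1 in (0, 1) for a2 in (0, 1)] * 3
labels = [a0 ^ a1 for a0, a1, _ in rows]
flat = NaiveBayesLeaf(rows, labels, n_values=[2, 2, 2], classes=[0, 1])
tree = build(rows, labels, n_values=[2, 2, 2], classes=[0, 1])
flat_acc = accuracy(flat, rows, labels)
hybrid_acc = sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)
print(flat_acc, hybrid_acc)
```

On this toy data the two informative attributes violate conditional independence, so a single Naive-Bayes model cannot separate the classes, while one split lets each leaf model fit its own region. Scoring splits on training accuracy is a shortcut that overfits on real data; a held-out or cross-validated estimate would be the safer choice.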
Similar resources
Int Reduction
Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, N...
Scaling Up the Accuracy of Naive Bayes Classifiers: A Decision Tree Hybrid
Naive Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive Bayes does not scale up as well as decision trees. We then propose a new algorithm, NBTree ...
Attribute Normalization Techniques and Performance of Intrusion Classifiers: A Comparative Analysis
Network traffic has several attributes with different ranges of values. These attributes can be qualitative or quantitative in nature. Attributes with large values significantly influence the performance of an intrusion classifier, biasing it towards them. Attribute normalization eliminates such dominance by scaling the values of all the attributes to within a specific range. The...
Fast Perceptron Decision Tree Learning from Evolving Data Streams
Mining of data streams must balance three evaluation dimensions: accuracy, time and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees—Hoeffding Trees with naive Bayes models at the leaf nodes—albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron c...
Scaling Up the Accuracy of Decision-Tree Classifiers: A Naive-Bayes Combination
C4.5 and NB are two of the top 10 algorithms in data mining thanks to their simplicity, effectiveness, and efficiency. In order to integrate their advantages, NBTree builds a naive Bayes classifier on each leaf node of the built decision tree. NBTree significantly outperforms C4.5 and NB in terms of classification accuracy. However, it incurs very high time complexity. In this paper, we propose...
Publication date: 1996